Deprecate `get_rr_node_indices()` functions #1805

tangxifan · 2021-07-19T02:37:16Z

Description

This PR focuses on updating routing resource graph builder functions, where we use the refactored data structure RRGraphBuilder to replace the legacy data structure rr_node_indices.
This PR aims to eliminate the get_rr_node_indices() functions in the RRGraph builders, as one step further in deprecating the legacy data structure.

After this PR, the rr_node_indices data structure is only used in the verify_rr_node_indices() function:

vtr-verilog-to-routing/vpr/src/route/rr_graph2.cpp

Lines 1153 to 1156 in d3449e8

    
    bool verify_rr_node_indices(const DeviceGrid& grid, const t_rr_node_indices& rr_node_indices, const t_rr_graph_storage& rr_nodes) {
        std::unordered_map<int, int> rr_node_counts;
        auto& device_ctx = g_vpr_ctx.device();
        const auto& rr_graph = device_ctx.rr_graph;

verify_rr_node_indices() could become an API of the RRGraphBuilder data structure, since it is a validator.

Checklist:

Added comments to API add_nodes_at_all_locs() as requested in PR Deploy RRGraphBuilder in RRGraph Reader and Writer to replace the use of rr_node_indices #1800
Added a new API find_grid_nodes_at_all_sides() to RRSpatialLookup and removed the API find_sink_nodes()
Deprecate the get_rr_node_indices() functions
Remove the t_opin_connections_scratchpad and use local variables instead
Improve memory efficiency in RRSpatialLookup APIs (A lot of rework still needed)
Remove the use of scratchpad in the timing-driven placer lookup builder to save memory.

Related Issue

Motivation and Context

This pull request is a follow-up PR on the routing resource graph refactoring effort #1801

How Has This Been Tested?

After the previous PR #1801, we started reworking, as a high priority, all the source files that use the legacy data structure rr_node_indices, in order to deprecate the legacy data structure as soon as possible.
Current statistics on the files that use rr_node_indices (in total there are 143 lines related):

./route/router_lookahead_map_utils.cpp
./route/rr_graph.cpp
./route/rr_graph2.cpp
./route/rr_graph2.h

This PR will remove the use in

./route/router_lookahead_map_utils.cpp
./route/rr_graph.cpp

Types of changes

Bug fix (change which fixes an issue)
New feature (change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation
I have updated the documentation accordingly
I have added tests to cover my changes
All new and existing tests passed

tangxifan · 2021-07-19T15:06:27Z

@vaughnbetz @hzeller This PR is ready for your review.

vaughnbetz · 2021-07-19T17:20:59Z

vpr/src/device/rr_graph_builder.h

    RRSpatialLookup& node_lookup();
    /* Add an existing rr_node in the node storage to the node look-up
+     * The node will be added to the lookup for every side it is on (for OPINs and IPINs) 
+     * and for every (x,y) location at which it exists (for wires that span more than one (x,y).


Small typo (probably in my original comment): stray bracket in "(for".

Sorry for my careless comments. Fixed now.
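The comment under review says a node is registered once per side (for pins) and once per (x,y) location (for wires spanning several locations). A minimal sketch of that idea follows. `ToyLookup`, `add_pin` and `add_wire` are hypothetical names invented for illustration and are not the RRSpatialLookup API; sides and node ids are plain ints here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical toy lookup keyed by (x, y, side); not the RRSpatialLookup API.
class ToyLookup {
  public:
    void add(int x, int y, int side, int node) {
        table_[std::make_tuple(x, y, side)] = node;
    }
    // Returns -1 when nothing is registered at the key.
    int find(int x, int y, int side) const {
        auto it = table_.find(std::make_tuple(x, y, side));
        return it == table_.end() ? -1 : it->second;
    }
    std::size_t size() const { return table_.size(); }

  private:
    std::map<std::tuple<int, int, int>, int> table_;
};

// Register a pin node once per side it appears on.
inline void add_pin(ToyLookup& lookup, int x, int y, const std::vector<int>& sides, int node) {
    for (int side : sides) lookup.add(x, y, side, node);
}

// Register a horizontal wire once per x location it spans (inclusive range).
inline void add_wire(ToyLookup& lookup, int xlow, int xhigh, int y, int node) {
    assert(xlow <= xhigh);
    for (int x = xlow; x <= xhigh; ++x) lookup.add(x, y, /*side=*/0, node);
}
```

With this layout, a query at any location covered by a wire, or at any side a pin sits on, finds the same node.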

vaughnbetz

It seems like some of these data structures (the "label" ones for example) are really storing track numbers, not RRNodeIds. Hence I think some of the vector type changes are incorrect/reduce clarity. Please take a look at the detailed comments and see if you agree.

If they are indeed RRNodeId vectors, then we should still make a change to avoid RRNodeId(UN_SET) and instead use RRNodeId::INVALID().

vaughnbetz · 2021-07-19T17:38:09Z

vpr/src/route/rr_graph2.cpp

                }

-                if ((*incoming_wire_label[side_cw])[itrack] != UN_SET) {
+                if ((*incoming_wire_label[side_cw])[itrack] != RRNodeId(UN_SET)) {


This cast looks strange to me. Is this data structure storing an rr_node index (RRNodeId) or an integer that represents something else? If it is storing an RRNodeId, it seems like we should get rid of UN_SET and instead use RRNodeId::INVALID() as the sentinel value for not set yet (and update the commenting to match).

Right now both RRNodeId::INVALID() and UN_SET are -1, I think, so it will all work, but it seems strange to define a separate UN_SET invalid sentinel.

If this data structure is indeed storing an RRNodeId it would be good to add a comment to that effect in this routine and in the data structure definition (unless there is one already, but I didn't find it). From a quick look at the code I couldn't quite figure out if this incoming_wire_label data structure is storing some kind of track index, or if it's a unique rr_node id. If you know the answer to that Xifan, it would be good to add a comment now.
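To make the sentinel point concrete, here is a minimal sketch of a strongly typed id with a named invalid value. `NodeId` is a hypothetical stand-in written for this example, not the real RRNodeId class, and the value -1 for both sentinels is taken from the remark above rather than checked against the code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a strongly typed node id; not the real VTR class.
class NodeId {
  public:
    constexpr NodeId() : value_(-1) {}
    explicit constexpr NodeId(int value) : value_(value) {}
    // Named sentinel: callers never need to know the underlying value.
    static constexpr NodeId INVALID() { return NodeId(); }
    constexpr bool is_valid() const { return value_ >= 0; }
    friend constexpr bool operator==(NodeId a, NodeId b) { return a.value_ == b.value_; }
    friend constexpr bool operator!=(NodeId a, NodeId b) { return !(a == b); }

  private:
    int value_;
};

// Track labels are plain channel positions, so they keep an integer sentinel.
constexpr int UN_SET = -1;

// Count entries that have been assigned a real node id.
inline std::size_t count_valid(const std::vector<NodeId>& nodes) {
    std::size_t n = 0;
    for (NodeId id : nodes) {
        if (id != NodeId::INVALID()) ++n;
    }
    return n;
}
```

In this sketch `NodeId(UN_SET)` compares equal to `NodeId::INVALID()` only because both happen to wrap -1, which is the coupling the review comment wants to avoid.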

vaughnbetz · 2021-07-19T17:43:42Z

vpr/src/route/rr_graph2.cpp

     * If seg_type_index == UNDEFINED, all segments in the channel are considered. Otherwise this routine
     * only looks at segments that belong to the specified segment type. */

-    std::vector<int>& labels = *labels_ptr;


Are these "labels" actually RRNodeIds, or are they track numbers, or something else? Depending on the answer, they should stay ints, or be converted to RRNodeIds as this PR does. The comment should be updated to explain what a label is (is it an RRNodeId, a track number, or something else?).

vaughnbetz · 2021-07-19T17:44:30Z

vpr/src/route/rr_graph2.cpp


    /* Alloc the list of labels for the tracks */
    labels.resize(max_chan_width);
-    std::fill(labels.begin(), labels.end(), UN_SET);


Same comment on UN_SET: it seems like it should be RRNodeId::INVALID() if this is actually storing an rr_node index. And if it isn't, we should change the type of the vector back to int.

vaughnbetz · 2021-07-19T17:46:02Z

vpr/src/route/rr_graph2.cpp

+    RRNodeId max_track = RRNodeId::INVALID();

    for (int i = 0; i < num_wire_muxes; i++) {
-        if (wire_mux_on_track[i] == from_track) {


It seems like label is a track number here (from 0 to W-1) so this implies "labels" are really "track numbers" and should be kept as ints, I think.

vaughnbetz · 2021-07-19T17:47:12Z

vpr/src/route/rr_graph2.h


 typedef vtr::NdMatrix<short, 6> t_sblock_pattern;

 struct t_opin_connections_scratchpad {


Good to comment what this is for, if you know. Why a dimension of 8?
Does it store RRNodeIds, or is it some tile structure that stores track numbers or some such (which would be better left as an int)?

tangxifan · 2021-07-19T21:52:41Z

Hi @vaughnbetz
Thanks for the constructive comments. I had the same feeling about t_opin_connections_scratchpad while doing the refactoring.

My thoughts

After an in-depth read of the code, I think you are right.

The data type of t_opin_connections_scratchpad should be int rather than RRNodeId. It is mainly used in the codebase to store a track index (0 ... max_chan_width - 1) rather than a valid node id.
Oddly, in two functions, build_rr_sinks_sources() and get_opin_direct_connections(), it is used to store the node ids that are required to create edges.

To summarize, its usage in the current code is mixed.

Why t_opin_connections_scratchpad has 8 dimensions: from the code, it appears to contain two groups of 4 entries each, one entry per side of a switch block.

vtr-verilog-to-routing/vpr/src/route/rr_graph2.cpp

Line 2290 in d3449e8

VTR_ASSERT(scratchpad->scratch.size() == NUM_SIDES * 2);

The first group denotes the output ports of a switch block

vtr-verilog-to-routing/vpr/src/route/rr_graph2.cpp

Line 2366 in d3449e8

wire_mux_on_track[side] = &scratchpad->scratch[side];

The second group denotes the incoming wires, which are the input ports of a switch block

vtr-verilog-to-routing/vpr/src/route/rr_graph2.cpp

Line 2356 in d3449e8

incoming_wire_label[side] = &scratchpad->scratch[NUM_SIDES + side];
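The two-group layout described above can be sketched as follows. `ScratchpadModel`, `kNumSides` and the two accessor functions are hypothetical names for this illustration; only the indexing scheme (`side` for the first group, `NUM_SIDES + side` for the second) comes from the quoted lines.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: four sides, as for a switch block. The real constant lives in VPR.
constexpr std::size_t kNumSides = 4;

// Hypothetical model of the scratchpad: 2 * kNumSides integer vectors.
struct ScratchpadModel {
    std::array<std::vector<int>, 2 * kNumSides> scratch;
};

// Group 0 (indices 0..3): wire-mux track numbers, one vector per side.
inline std::vector<int>& wire_mux_slot(ScratchpadModel& pad, std::size_t side) {
    assert(side < kNumSides);
    return pad.scratch[side];
}

// Group 1 (indices 4..7): incoming-wire labels, one vector per side.
inline std::vector<int>& incoming_wire_slot(ScratchpadModel& pad, std::size_t side) {
    assert(side < kNumSides);
    return pad.scratch[kNumSides + side];
}
```

Both groups hold track numbers in this model, which is why plain int vectors are enough for them.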

Detailed Analysis

The t_opin_connections_scratchpad is used in two code blocks:

The code block that creates unidirectional switch block patterns before allocating rr_graph:

vtr-verilog-to-routing/vpr/src/route/rr_graph.cpp

Lines 618 to 657 in d3449e8

    
    if (is_global_graph) {
        switch_block_conn = alloc_and_load_switch_block_conn(1, SUBSET, 3);
    } else if (BI_DIRECTIONAL == directionality) {
        if (sb_type == CUSTOM) {
            sb_conn_map = alloc_and_load_switchblock_permutations(chan_details_x, chan_details_y,
                                                                  grid,
                                                                  switchblocks, &nodes_per_chan, directionality,
                                                                  switchpoint_rand_state);
        } else {
            switch_block_conn = alloc_and_load_switch_block_conn(max_chan_width, sb_type, Fs);
        }
    } else {
        VTR_ASSERT(UNI_DIRECTIONAL == directionality);
        if (sb_type == CUSTOM) {
            sb_conn_map = alloc_and_load_switchblock_permutations(chan_details_x, chan_details_y,
                                                                  grid,
                                                                  switchblocks, &nodes_per_chan, directionality,
                                                                  switchpoint_rand_state);
        } else {
            /* it looks like we get unbalanced muxing from this switch block code with Fs > 3 */
            VTR_ASSERT(Fs == 3);
            t_opin_connections_scratchpad scratchpad;
            unidir_sb_pattern = alloc_sblock_pattern_lookup(grid, max_chan_width);
            for (size_t i = 0; i < grid.width() - 1; i++) {
                for (size_t j = 0; j < grid.height() - 1; j++) {
                    load_sblock_pattern_lookup(i, j, grid, &nodes_per_chan,
                                               chan_details_x, chan_details_y,
                                               Fs, sb_type, unidir_sb_pattern,
                                               &scratchpad);
                }
            }
            if (getEchoEnabled() && isEchoFileEnabled(E_ECHO_SBLOCK_PATTERN)) {
                dump_sblock_pattern(unidir_sb_pattern, max_chan_width, grid,
                                    getEchoFileName(E_ECHO_SBLOCK_PATTERN));
            }
        }
    }

The code block that creates edges for unidirectional wires/OPINs when allocating rr_graph:

vtr-verilog-to-routing/vpr/src/route/rr_graph.cpp

Lines 1182 to 1198 in d3449e8

    
    t_opin_connections_scratchpad scratchpad;
    /* If Fc gets clipped, this will be flagged to true */
    *Fc_clipped = false;
    /* Connection SINKS and SOURCES to their pins. */
    for (size_t i = 0; i < grid.width(); ++i) {
        for (size_t j = 0; j < grid.height(); ++j) {
            build_rr_sinks_sources(rr_graph_builder, i, j, L_rr_node, rr_edges_to_create, L_rr_node_indices,
                                   delayless_switch, grid, &scratchpad);
            //Create the actual SOURCE->OPIN, IPIN->SINK edges
            uniquify_edges(rr_edges_to_create);
            alloc_and_load_edges(L_rr_node, rr_edges_to_create);
            rr_edges_to_create.clear();
        }
    }

The following functions use the scratchpad:

build_rr_graph()
- load_sblock_pattern_lookup()
  - label_incoming_wires() This function always resets and refills the scratchpad
  - label_wire_muxes() This function always resets and refills the scratchpad
- alloc_and_load_rr_graph()
  - build_rr_sinks_sources() This function only assigns some RRNodeIds, which are never used later
  - build_bidir_rr_opins()
  - get_opin_direct_connections() This function always resets the 1st dimension of the scratchpad and refills it with some RRNodeIds. The node ids are used to create edges.
  - build_unidir_rr_opins()
    - get_unidir_opin_connections()
      - label_wire_muxes() This function always resets and refills the scratchpad
    - get_opin_direct_connections() This function always resets the 1st dimension of the scratchpad and refills it with some RRNodeIds. The node ids are used to create edges.
- build_rr_chan()
  - get_track_to_tracks()
    - get_unidir_track_to_chan_seg()
      - label_wire_muxes() This function always resets and refills the 1st and 2nd dimensions of the scratchpad

Action items

I think the scratchpad creates a lot of mess between functions.

Different functions use it for different data types.
It is always reset and refilled inside each function, and previous results are never reused. So it is not used to exchange data between functions; it is indeed a scratchpad.

My proposal for the refactoring:

Since the scratchpad does not help in exchanging data, it should be a local variable in these functions.
The scratchpad should use int as its data type because it is used in switch block pattern generation.
The scratchpad in build_rr_sinks_sources() and get_opin_direct_connections() should be removed and replaced with a local vector of RRNodeId.
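A rough sketch of what these action items could look like follows. It assumes two kinds of helper, one producing track numbers and one collecting node ids. `NodeId`, `label_tracks` and `collect_nodes` are hypothetical names, and the even-track labelling rule exists only to give the example something to compute.

```cpp
#include <cassert>
#include <vector>

constexpr int kUnset = -1;  // integer sentinel for "no track assigned"

// Hypothetical strong id, used only where real node ids are stored.
struct NodeId {
    int value;
    explicit NodeId(int v) : value(v) {}
};

// Switch-block style helper: returns track numbers in a local int vector.
// Even-numbered tracks are labelled here purely for illustration.
inline std::vector<int> label_tracks(int max_chan_width) {
    assert(max_chan_width >= 0);
    std::vector<int> labels(max_chan_width, kUnset);  // local, freshly filled
    int next = 0;
    for (int track = 0; track < max_chan_width; track += 2) {
        labels[next++] = track;
    }
    return labels;
}

// Edge-building style helper: collects node ids in a local NodeId vector.
inline std::vector<NodeId> collect_nodes(const std::vector<int>& raw_ids) {
    std::vector<NodeId> nodes;
    nodes.reserve(raw_ids.size());
    for (int id : raw_ids) {
        if (id >= 0) nodes.push_back(NodeId(id));  // skip unset entries
    }
    return nodes;
}
```

Each helper owns its storage and declares the element type it really uses, so no caller has to know which meaning a shared buffer currently carries.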

Let me know what you think. Once we converge on the action items, I can do the refactoring accordingly.

vaughnbetz · 2021-07-19T22:21:26Z

Thanks for the detailed analysis Xifan. I agree with your proposal -- making these local variables of the right type seems like the best approach.

tangxifan · 2021-07-19T23:21:54Z

@vaughnbetz Thanks for the input. I have removed the use of t_opin_connections_scratchpad as an input argument of functions, and also added comments to t_opin_connections_scratchpad based on the analysis.

The PR is ready for your review.

tangxifan · 2021-07-20T04:07:06Z

I do not know why the basic sanity tests failed. I am looking into the problem and will ping you when the CI is green. Then it will be truly ready for code review.

vaughnbetz · 2021-07-20T23:46:53Z

Maybe without the scratchpads we're doing huge amounts of memory traffic and memory fragmentation?
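As a hedged illustration of the allocation concern, the sketch below refills one long-lived vector and counts how often its buffer address changes. `fill_reused` and `count_buffer_moves` are hypothetical helpers written for this example; they are not taken from VPR and say nothing about what actually caused the CI memory growth.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill `out` with 0..n-1, reusing whatever capacity it already has.
inline void fill_reused(std::vector<int>& out, std::size_t n) {
    out.clear();     // destroys the elements but keeps the allocated buffer
    out.reserve(n);  // allocates only if the current capacity is too small
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i));
    }
}

// Count how often the buffer address changes across `calls` refills.
inline std::size_t count_buffer_moves(std::size_t calls, std::size_t n) {
    assert(n > 0);
    std::vector<int> scratch;
    const int* last = nullptr;
    std::size_t moves = 0;
    for (std::size_t c = 0; c < calls; ++c) {
        fill_reused(scratch, n);
        if (scratch.data() != last) {
            ++moves;
            last = scratch.data();
        }
    }
    return moves;
}
```

A vector constructed inside the loop body would instead allocate and free a buffer on every iteration, which is the kind of traffic a scratchpad, or an up-front reserve, avoids.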

tangxifan · 2021-07-20T23:55:35Z

> Maybe without the scratchpads we're doing huge amounts of memory traffic and memory fragmentation?

Yes. I checked the log files of the basic regression tests in sanity mode, and RAM usage explodes. In some tests the peak memory usage is 6 GB, which caused the CI runner to abort.

However, the scratchpad is only used in the rr_graph builder, and we did not see a sharp increase in memory there.
The placer really eats a lot of memory, far more than the rr_graph and the router lookahead map.
I tried to reproduce the error today on an Ubuntu machine, but oddly I cannot reproduce the error or the memory usage there.

I am trying to find a way to reduce the peak memory usage when debug mode is on.

Attached are some lines from the log file at https://github.com/verilog-to-routing/vtr-verilog-to-routing/pull/1805/checks?check_run_id=3118949531

<## Placement Quench took 0.40 seconds (max_rss 3763.9 MiB)
	<
	<BB estimate of min-dist (placement) wire length: 602
	<
	<Completed placement consistency check successfully.
	<
	<Swaps called: 154126
	<
	<Aborted Move Reasons:
	<  No moves aborted
	<Placement cost: 6.02316, bb_cost: 6.02316, td_cost: nan, 
	<
	<Placement resource usage:
	<  io     implemented as io    : 229
	<  clb    implemented as clb   : 72
	<  memory implemented as memory: 1
	<
	<Placement number of temperatures: 152
	<Placement total # of swap attempts: 154126
	<	Swaps accepted:  78707 (51.1 %)
	<	Swaps rejected:  75419 (48.9 %)
	<	Swaps aborted :      0 ( 0.0 %)
	<
	<
	<Percentage of different move types:
	<	Uniform move: 100.00 % (acc=51.07 %, rej=48.93 %, aborted=0.00 %)
	<	W. Centroid move: 0.00 % (acc=100.00 %, rej=0.00 %, aborted=0.00 %)
	<
	<Placement Quench timing analysis took 0 seconds (0 STA, 0 slack) (0 full updates: 0 setup, 0 hold, 0 combined).
	<Placement Total  timing analysis took 0 seconds (0 STA, 0 slack) (0 full updates: 0 setup, 0 hold, 0 combined).
	<update_td_costs: connections 0 nets 0 sum_nets 0 total 0
	<# Placement took 48.34 seconds (max_rss 3764.1 MiB, delta_rss +3344.4 MiB)
	<
	<# Routing
	<Initializing minimum channel width search using specified hint
	<
	<Attempting to route at 36 channels (binary search bounds: [-1, -1])
	<## Build routing resource graph
	<## Build routing resource graph took 4.63 seconds (max_rss 4229.8 MiB, delta_rss +463.2 MiB)
	<  RR Graph Nodes: 11337
	<  RR Graph Edges: 46620
	<Confirming router algorithm: TIMING_DRIVEN.
	<## Computing router lookahead map
	<### Computing wire lookahead
	<### Computing wire lookahead took 5.35 seconds (max_rss 4785.5 MiB, delta_rss +512.2 MiB)
	<### Computing src/opin lookahead
	<### Computing src/opin lookahead took 0.03 seconds (max_rss 4787.9 MiB, delta_rss +2.3 MiB)
	<## Computing router lookahead map took 5.38 seconds (max_rss 4787.9 MiB, delta_rss +514.6 MiB)

vaughnbetz · 2021-07-21T00:23:37Z

The placer builds the rr-graph to compute the place delay matrix (uses a router-like algorithm) so the rr-graph would be built there.
I suspect that's the largest memory use of the placer.

tangxifan · 2021-07-21T00:25:25Z

> The placer builds the rr-graph to compute the place delay matrix (uses a router-like algorithm) so the rr-graph would be built there.
> I suspect that's the largest memory use of the placer.

Yes. I am checking my previous PRs. I remember that I modified a function in the placer, which may be the source of this mess.

tangxifan · 2021-07-29T15:45:47Z

A to-do list as a reminder before this PR can be merged: upload a QoR comparison between the current branch and VTR before the refactoring (I will create a branch based on current master and revert a number of commits) on the following tests:

Titan benchmarks
VTR benchmarks
ch_intrinsics and diffeq1 in Sanity basic tests

tangxifan · 2021-08-10T03:20:07Z

Just tried the Titan benchmarks on this PR. The gaussianblur benchmark cannot be routed. I am going to run the Titan benchmarks on an old master (before the refactoring), aiming to spot which PR caused the problem.

Attached the log file:

==========================================================================
                  Verilog-to-Routing Regression Testing
==========================================================================
           Running vtr_reg_weekly
--------------------------------------------
scripts/run_vtr_task.py -l /research/ece/lnis/USERS/tang/github/vtr-verilog-to-routing/vtr_flow/tasks/regression_tests/vtr_reg_weekly/task_list.txt -j 23 -script run_vtr_flow.py -short_task_names 

stratixiv_arch.timing/neuron_stratixiv_arch_timing		OK (took 1053.81 seconds)
stratixiv_arch.timing/stereo_vision_stratixiv_arch_timing		OK (took 1099.90 seconds)
stratixiv_arch.timing/sparcT1_core_stratixiv_arch_timing		OK (took 1616.09 seconds)
stratixiv_arch.timing/cholesky_mc_stratixiv_arch_timing		OK (took 1633.08 seconds)
stratixiv_arch.timing/SLAM_spheric_stratixiv_arch_timing		OK (took 2493.78 seconds)
stratixiv_arch.timing/des90_stratixiv_arch_timing		OK (took 3111.53 seconds)
stratixiv_arch.timing/dart_stratixiv_arch_timing		OK (took 3653.31 seconds)
stratixiv_arch.timing/segmentation_stratixiv_arch_timing		OK (took 4292.70 seconds)
stratixiv_arch.timing/openCV_stratixiv_arch_timing		OK (took 5096.52 seconds)
stratixiv_arch.timing/cholesky_bdti_stratixiv_arch_timing		OK (took 5212.57 seconds)
stratixiv_arch.timing/minres_stratixiv_arch_timing		OK (took 5303.49 seconds)
stratixiv_arch.timing/stap_qrd_stratixiv_arch_timing		OK (took 6374.38 seconds)
stratixiv_arch.timing/bitonic_mesh_stratixiv_arch_timing		OK (took 6379.55 seconds)
stratixiv_arch.timing/sparcT2_core_stratixiv_arch_timing		OK (took 8691.24 seconds)
stratixiv_arch.timing/denoise_stratixiv_arch_timing		OK (took 10788.75 seconds)
stratixiv_arch.timing/gsm_switch_stratixiv_arch_timing		OK (took 12355.49 seconds)
stratixiv_arch.timing/mes_noc_stratixiv_arch_timing		OK (took 16831.46 seconds)
stratixiv_arch.timing/LU_Network_stratixiv_arch_timing		OK (took 18483.60 seconds)
stratixiv_arch.timing/sparcT1_chip2_stratixiv_arch_timing		OK (took 22005.97 seconds)
stratixiv_arch.timing/bitcoin_miner_stratixiv_arch_timing		OK (took 38720.83 seconds)
stratixiv_arch.timing/directrf_stratixiv_arch_timing		OK (took 44879.60 seconds)
stratixiv_arch.timing/LU230_stratixiv_arch_timing		OK (took 85939.03 seconds)
stratixiv_arch.timing/gaussianblur_stratixiv_arch_timing		Error: Executable vpr failed
	full command:  /usr/bin/env time -v /research/ece/lnis/USERS/tang/github/vtr-verilog-to-routing/vpr/vpr stratixiv_arch.timing.xml gaussianblur_stratixiv_arch_timing --circuit_file gaussianblur_stratixiv_arch_timing.pre-vpr.blif --route_chan_width 300 --max_router_iterations 400 --router_lookahead map --inner_num 2 --astar_fac 1.0 --sdc_file /research/ece/lnis/USERS/tang/github/vtr-verilog-to-routing/vtr_flow/benchmarks/titan_blif/gaussianblur_stratixiv_arch_timing.sdc
	returncode  :  -15
	log file    :  /research/ece/lnis/USERS/tang/github/vtr-verilog-to-routing/vtr_flow/tasks/regression_tests/vtr_reg_weekly/vtr_reg_titan_he/run001/stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common/vpr.out
failed: Executable vpr failed (took 448194.47 seconds)
Elapsed time: 448194.79 seconds

Parsing test results...
scripts/parse_vtr_task.py -l /research/ece/lnis/USERS/tang/github/vtr-verilog-to-routing/vtr_flow/tasks/regression_tests/vtr_reg_weekly/task_list.txt
Elapsed time: 6.55 seconds

Calculating QoR results...

regression_tests/vtr_reg_weekly/vtr_reg_titan_he...[Fail]
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common vpr_status Task value 'exited with return code -15' does not match golden 'success'
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common logic_block_area_total relative value inf outside of range [0.8,1.3] and not equal to golden value: 0.0
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common logic_block_area_used relative value inf outside of range [0.8,1.3] and not equal to golden value: 0.0
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common routing_area_total relative value -4.2460501118834204e-10 outside of range [0.8,1.3] and not equal to golden value: 2355130000.0
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common routing_area_per_tile relative value -4.9558925562493803e-05 outside of range [0.8,1.3] and not equal to golden value: 20178.0
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common crit_path_route_time relative value 156.82601085663865 outside of range [0.1,10.0], above absolute threshold 2.0 and not equal to golden value: 2085.36
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common max_vpr_mem relative value -3.3827119066046776e-08 outside of range [0.8,1.2], above absolute threshold 102400.0 and not equal to golden value: 29562080.0
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common critical_path_delay relative value -0.0010828030305491218 outside of range [0.5,1.4] and not equal to golden value: 923.529
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common geomean_nonvirtual_intradomain_critical_path_delay relative value -0.0010828030305491218 outside of range [0.5,1.4] and not equal to golden value: 923.529
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common setup_TNS relative value 7.812622072219879e-09 outside of range [0.5,1.4] and not equal to golden value: -127998000.0
[Fail]
stratixiv_arch.timing.xml/gaussianblur_stratixiv_arch_timing.blif/common setup_WNS relative value 0.0010839767638740896 outside of range [0.5,1.4] and not equal to golden value: -922.529

regression_tests/vtr_reg_weekly/vtr_reg_titan_he...[Fail]
[Fail]
stratixiv_arch.timing.xml/LU_Network_stratixiv_arch_timing.blif/common placed_CPD_est relative value 1.4130971947890367 outside of range [0.5,1.4] and not equal to golden value: 6.33357
[Fail]
stratixiv_arch.timing.xml/LU_Network_stratixiv_arch_timing.blif/common placed_setup_WNS_est relative value 1.4905494818667422 outside of range [0.5,1.4] and not equal to golden value: -5.33357
[Fail]
stratixiv_arch.timing.xml/LU_Network_stratixiv_arch_timing.blif/common critical_path_delay relative value 1.4059740225349875 outside of range [0.5,1.4] and not equal to golden value: 6.79743
[Fail]
stratixiv_arch.timing.xml/LU_Network_stratixiv_arch_timing.blif/common setup_WNS relative value 1.476000572667544 outside of range [0.5,1.4] and not equal to golden value: -5.79743

Test 'vtr_reg_weekly' had 15 qor test failures

Test 'vtr_reg_weekly' had 1 run failures

Error: 16 tests failed

vaughnbetz · 2021-08-10T16:04:42Z

@tangxifan : seems like two issues:

LU_network slowed down by 41% (outside 40% critical path tolerance).
Gaussian_blur failed to route. I think gaussian_blur can sometimes fail to route, so this may be seed noise.
Probably worth running a different seed to see if it passes, and if LU_Network is within the QoR bounds, as well.
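The 41% figure follows from the relative value in the log (measured divided by golden) compared against the [0.5, 1.4] range. A simplified sketch of that arithmetic is below. `within_relative_range` and `slowdown_percent` are hypothetical helpers, and the real QoR parser also applies absolute thresholds that are left out here.

```cpp
#include <cassert>

// Simplified check: is measured/golden inside [lo, hi]?
// The real QoR parser also applies absolute thresholds; that is omitted here.
inline bool within_relative_range(double relative, double lo, double hi) {
    assert(lo <= hi);
    return relative >= lo && relative <= hi;
}

// Slowdown expressed as a percentage over the golden value.
inline double slowdown_percent(double relative) {
    return (relative - 1.0) * 100.0;
}
```

For LU_Network the logged critical_path_delay relative value is 1.4059740225349875, which falls just above the upper bound of 1.4 and corresponds to a slowdown of roughly 40.6%.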

tangxifan · 2021-08-10T17:51:38Z

@vaughnbetz Got it. Let me rerun and see whether seed noise is indeed the problem. I will keep you posted.

Meanwhile, I have finished the QoR test on the VTR benchmark.
A short summary

No critical path delay degradation (as expected)
No change in peak memory usage
Reduced total flow runtime, but an average 1-4% increase in runtime for the placer and router (in most benchmarks the place & route runtime improves; the only glitch is sha, with a 40% increase in place & route runtime).

vtr_reg_qor_chain_depop_comp.xlsx

tangxifan · 2021-08-13T23:29:19Z

Just finished the QoR check on the titan_quick_qor test case

A short summary

No critical path delay changes
On average 1% reduction in peak memory usage after refactoring
On average 4% runtime increase (3% on pack time and 1% on place time)

Details can be found in the attached spreadsheet

vtr_reg_titan_quick_qor_comp.xlsx

tangxifan · 2021-08-14T01:26:05Z

Finished the basic sanity tests.
A short summary

The peak memory usage is increased by 3% on average.
Runtime is increased by 1% on average
The peak memory usage for the two biggest designs, ch_intrinsics and diffeq1, is 486 MB and 627 MB respectively. Both are in range.

Details can be found in the attached spreadsheet

vtr_reg_basic_timing_sanity_comp.xlsx

tangxifan · 2021-08-14T01:27:27Z

@vaughnbetz As suggested, I have completed the QoR checks on the Titan, VTR and basic sanity benchmarks. No changes in peak memory usage; small changes in runtime are observed. Please see if it is good to go.

vaughnbetz · 2021-08-15T18:22:53Z

Looks good; thanks @tangxifan . LU_Network didn't finish on the titan_quick_qor_test, but since it is tested in CI and CI is green, that must be a transient issue.

tangxifan added 6 commits July 18, 2021 17:14

[VPR] Add comments about API in rr_graph_builder

8841e5a

[VPR] Added a new API find_grid_nodes_at_all_sides() and deploy to client functions

60e209a

[VPR] Remove find_sink_nodes() API because it is a subset of API find_grid_nodes_at_all_sides()

1d63c47

[VPR] Now t_rr_edge_set data structure adapts RRNodeId; Fully shadowed the use of get_rr_node_indices() functions

b0e37cf

[VPR] Remove out-of-date functions related to rr_node_indices

5581815

[VPR] Code format fix

aa32abd

github-actions bot added the VPR VPR FPGA Placement & Routing Tool label Jul 19, 2021

tangxifan requested review from hzeller and vaughnbetz July 19, 2021 15:06

vaughnbetz reviewed Jul 19, 2021

[VPR] Fix small typo in RRGraphBuilder comments

9074d45

vaughnbetz requested changes Jul 19, 2021

[VPR] Remove the use of scratchpad as a variable across functions; Added comments to clarify the use of scratchpad

29383fb

[VPR] Remove t_opin_connections_scratchpad; Use natural vector definition instead

87112cc

tangxifan added 2 commits July 19, 2021 23:04

[VPR] Try to fix the error seen in sanity checks through using STL

4452b2f

[VPR] See RAM explosion on CI when sanity check is enabled. Try to reduce memory footprint

61670b4

tangxifan added 5 commits July 20, 2021 18:28

[VPR] Reserve memory in RRSpatialLookUp to avoid memory fragement

a72a771

[VPR] Remove the use of scratchpad and vectors in timing placer lookup

4d40785

[VPR] Code format fix

2023e27

[VPR] Add reserve_nodes() API to RRSpatialLookup for memory efficiency

9295ac8

[VPR] Remove the use of SIDES[0] when adding a node to RRSpatialLookup

5f4247a

ArashAhmadian mentioned this pull request Jul 29, 2021

Adding regression_mcnc & vtr_reg_multiclock to CI #1807

Merged

7 tasks

vaughnbetz merged commit 1a00ea9 into master Aug 15, 2021

vaughnbetz deleted the get_rr_node_indices branch August 15, 2021 18:22

tangxifan mentioned this pull request Aug 16, 2021

Remove legacy data structure rr_node_indices from DeviceContext #1828

Merged

11 tasks

tangxifan mentioned this pull request Sep 13, 2021

Add a new API set_node_type() to RRGraphBuilder #1847

Merged

10 tasks

hamzakhan-rs mentioned this pull request Sep 17, 2021

Add a new API set_node_coordinates() to RRGraphBuilder #1853

Merged

11 tasks

m-hariszafar mentioned this pull request Sep 22, 2021

Add a new API set_node_direction() to RRGraphBuilder #1854

Closed

11 tasks

hamzakhan-rs mentioned this pull request Sep 22, 2021

Add a new API set_node_capacity() to RRGraphBuilder #1855

Merged

11 tasks

m-hariszafar mentioned this pull request Sep 23, 2021

Add a new API set_node_direction() to RRGraphBuilder #1856

Merged

11 tasks

m-hariszafar mentioned this pull request Oct 1, 2021

Add a new API set_node_ptc_num() to RRGraphBuilder #1865

Closed

11 tasks

This was referenced Oct 13, 2021

Add a new API set_node_x_num() to RRGraphBuilder #1872

Merged

Add a new API set_node_cost_index() to RRGraphBuilder #1884

Merged

m-hariszafar mentioned this pull request Nov 1, 2021

Add a new APIs reserve_nodes() and resize_nodes to RRGraphBuilder #1905

Merged

11 tasks
